ABSTRACT

We use an ensemble of N-body simulations of the currently favoured (concordance) 
cosmological model to measure the amount of information contained in the non-linear 
matter power spectrum about the amplitude of the initial power spectrum. Two
surprising results emerge from this study: (i) that there is very little
independent information in the power spectrum in the translinear regime
($k \sim 0.2$-$0.8\,h\,\mathrm{Mpc}^{-1}$ at the present day) over and above the
information at linear scales and (ii) that the cumulative information begins to
rise sharply again with increasing wavenumber in the
non-linear regime. In the fully non-linear regime, the simulations are consistent with 
no loss of information during translinear and non-linear evolution. If this is indeed 
the case then the results suggest a picture in which translinear collapse is very rapid, 
and is followed by a bounce prior to virialization, impelling a wholesale revision of the 
HKLM-PD formalism. 

Key words: cosmology: theory - large-scale structure of Universe. 



1 INTRODUCTION 

Measurements of galaxy clustering play a key role in the quest to accurately
determine cosmological parameters. As pointed out by Eisenstein, Hu & Tegmark
(1998, 1999), constraints from galaxy clustering directly complement those
provided by the cosmic microwave background (CMB), enabling the breaking of
parameter degeneracies which affect both types of observation when considered in
isolation. The power of combining datasets in this way has recently been
demonstrated by several groups (Seljak et al. 2004; Tegmark et al. 2004;
Efstathiou et al. 2002) and the existence of a so-called 'concordance'
cosmological model is largely thanks to this type of effort.

On large scales, the power spectrum of galaxy cluster- 
ing traces the spectrum of primordial density fluctuations on 
those scales and is therefore a direct test of early-universe 
models. Many such models (including the simplest forms of 
the inflationary scenario) predict perturbations to the den- 
sity field that are Gaussian random distributed, and to date 
this generic prediction has remained consistent with observation
(Komatsu et al. 2003). For Gaussian fluctuations the
power spectrum contains all possible statistical information 
about the perturbations. 

At smaller scales, where much of the information in 
galaxy surveys lies, the extent to which the linear power spectrum can be
recovered from the non-linear power spectrum remains unknown. The success of the
HKLM-PD formalism (Hamilton et al. 1991; Peacock & Dodds 1994, 1996) in relating
the non-linear power spectrum to the linear one suggests that non-linear
evolution preserves at least some of the information in the power spectrum,
albeit transported from larger to smaller comoving scales by the process of
gravitational collapse. On the other hand, the simulations of Meiksin, White &
Peacock (1999) indicate that non-linear evolution washes out baryonic wiggles in
the power spectrum, suggesting that some information is perhaps irreversibly
destroyed.

The purpose of the present paper is to investigate quan- 
titatively, using N-body simulations, whether or not non- 
linear evolution preserves information in the matter power 
spectrum. 



2 INFORMATION 

We measure the Fisher information $I$ (Tegmark, Taylor & Heavens 1997) in the
log of the amplitude $A$ of the initial (post-recombination) matter power
spectrum:



$$ I \equiv -\left\langle \frac{\partial^2 \ln \mathcal{L}}{\partial (\ln A)^2} \right\rangle . \qquad (1) $$

Figure 1. Evolution of the non-linear power spectrum. Points with error bars
show the mean power spectrum at three epochs (bottom to top: a = 0.5, 0.67 and
1) in the PM simulations. Open points are averages over the 256 $h^{-1}$ Mpc
simulations and filled points are the same for the 128 $h^{-1}$ Mpc simulations.
The error bars are statistically independent. The dashed error bar extends
beyond the bounds of the axes plotted. Stars are the results from the 25 higher
resolution ART simulations (shown without error bars for clarity). The linear
power spectrum is shown by the dotted curves in each panel. The solid and dashed
curves are, respectively, from the fitting formulae of Smith et al. (2003) and
Peacock & Dodds (1996). The dot-dashed line marks the level of the shot noise in
the 256 $h^{-1}$ Mpc boxes.



Here $\mathcal{L}$ denotes the likelihood, which for Gaussian fluctuations takes
the form
$\mathcal{L} \propto |C|^{-1/2} \exp(-\frac{1}{2} \delta^{\dagger} C^{-1} \delta)$,
where $\delta$ is the observed data vector of fluctuations, and $C$ is their
expected covariance matrix. If fluctuations are statistically homogeneous and
isotropic, then each Fourier mode $\delta_k$ is independently Gaussian
distributed. The variances of the Fourier modes, which are the diagonal elements
of the (then diagonal) covariance matrix, constitute the power spectrum,
$\langle |\delta_k|^2 \rangle \propto P(k)$.

Thus, for Gaussian fluctuations, the likelihood depends on parameters only
through the power spectrum $P(k)$, and the information $I$ defined by
equation (1) can be written in the form of equation (2).



In counting modes, $\delta_k$ and its complex conjugate $\delta_{-k}$ are
counted as contributing two distinct modes, the real and imaginary parts of
$\delta_k$.

At non-linear scales, we continue to define the information $I$ in the
non-linear power spectrum $P(k)$ by equation (2). Clearly, there is a mapping
from the initial linear power spectrum $P_{\rm L}(k)$ to the non-linear power
spectrum $P(k)$: to find it, just do an N-body simulation (the cosmic variance
in the non-linear power spectrum should be negligible if the simulation is large
enough). On the other hand, it is not clear a priori whether an inverse mapping
exists. If it does, then the Fisher information $I$, equation (2), in the
non-linear power spectrum should equal that in the initial linear power
spectrum: information is preserved. Conversely, if an inverse mapping does not
exist, then the Fisher information in the non-linear power spectrum should be
less than that in the initial power spectrum: non-linear evolution destroys
information.

The definition (2) of information involves partial derivatives
$\partial \ln P(k) / \partial \ln A$ of the log of the non-linear power with
respect to the log of the initial, linear amplitude. In an N-body simulation,
increasing the initial amplitude is equivalent to evolving the simulation
further in time. Thus the
desired partial derivatives can be measured simply by com- 
paring the amplitudes of non-linear power P(k) at successive 
epochs. It is this property that makes the information in the 
log amplitude especially convenient to measure from simula- 
tions: there is no need to perform simulations with different 
values of cosmological parameters. 
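
As a minimal sketch of this measurement (an illustration, not the code used for
the paper), the derivative can be approximated by a finite difference between
two output epochs. The sketch assumes that a step in time maps onto a step in
initial amplitude through the linear growth factor $D$, with $A \propto D^2$;
the function name and the toy growth factors are hypothetical.

```python
import numpy as np

def dlnP_dlnA(power_early, power_late, growth_early, growth_late):
    """Finite-difference estimate of d ln P(k) / d ln A between two epochs.

    Assumption (for illustration only): the initial amplitude A scales as the
    square of the linear growth factor, so Delta ln A = 2 * Delta ln D.
    """
    power_early = np.asarray(power_early, dtype=float)
    power_late = np.asarray(power_late, dtype=float)
    delta_ln_amplitude = 2.0 * np.log(growth_late / growth_early)
    return np.log(power_late / power_early) / delta_ln_amplitude

# Toy check: purely linear growth, P proportional to D^2, gives a slope of unity.
growth_early, growth_late = 0.6, 1.0
linear_early = np.array([100.0, 50.0, 10.0])
linear_late = linear_early * (growth_late / growth_early) ** 2
slopes = dlnP_dlnA(linear_early, linear_late, growth_early, growth_late)
```

In the non-linear regime the same function would be applied to the measured mean
power spectra at neighbouring epochs, where the slopes are expected to depart
from unity.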

The other factor in the definition (2) of information is the second derivative
of the log likelihood with respect to the log non-linear powers. This factor is
the Hessian of the log likelihood with respect to the vector $\ln P(k)$ of log
non-linear powers, the expectation value of which is, up to sign, the Fisher
matrix with respect to the log powers. Since each $P(k)$ involves an expectation
over many modes $\delta_k$, it is reasonable to invoke the central limit theorem
to assert that estimates of power will be distributed in a Gaussian fashion
about the expectation value, in which case this factor is approximately equal to
the inverse of the covariance matrix of power spectrum estimates. This assertion
holds even if the density field itself is non-Gaussian. The reliability of the
approximation can be checked at linear scales (where there are fewest modes so
the central limit theorem is least likely to apply), where the Fisher matrix
should be diagonal, with diagonal elements equal to half the number of modes in
each wavenumber bin.
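
The linear-scale check described above can be illustrated with a toy Monte Carlo
experiment. The sketch below is not the paper's pipeline: it draws independent
unit-variance Gaussian modes (so the true power is 1 in every bin), estimates
the scaled power in each bin, and inverts the sample covariance. The function
name, the numbers of modes and the number of realizations are all illustrative
assumptions.

```python
import numpy as np

def fisher_of_scaled_power(n_modes_per_bin, n_realizations=20000, seed=0):
    """Invert the sample covariance of scaled power estimates P_hat / P.

    Each bin holds n_modes independent real Gaussian modes of unit variance,
    so P_hat / P is simply the mean of the squared mode amplitudes.
    """
    rng = np.random.default_rng(seed)
    n_modes_per_bin = np.asarray(n_modes_per_bin)
    estimates = np.empty((n_realizations, n_modes_per_bin.size))
    for i, n_modes in enumerate(n_modes_per_bin):
        modes = rng.standard_normal((n_realizations, n_modes))
        estimates[:, i] = np.mean(modes ** 2, axis=1)
    covariance = np.cov(estimates, rowvar=False)
    return np.linalg.inv(covariance)

n_modes = np.array([20, 50, 100])
fisher = fisher_of_scaled_power(n_modes)
# The diagonal should lie close to n_modes / 2 and the off-diagonal
# elements should be small compared with the diagonal.
```

Because the variance of each scaled estimate is 2/N for N real Gaussian modes,
the recovered diagonal approaches N/2 as the number of realizations grows.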



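$$ I = -\sum_{k,k'} \frac{\partial \ln P(k)}{\partial \ln A} \left\langle \frac{\partial^2 \ln \mathcal{L}}{\partial \ln P(k)\, \partial \ln P(k')} \right\rangle \frac{\partial \ln P(k')}{\partial \ln A} . \qquad (2) $$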



For simplicity, in this paper we use results from simulations 
only at scales where the shot noise contribution to the power 
is subdominant, so that $P(k)$ in equation (2) can be regarded
as the cosmic variance. 

During linear evolution, the partial derivatives of the log power with respect
to log amplitude in equation (2) are just unity,
$\partial \ln P(k) / \partial \ln A = 1$. A short calculation from the Gaussian
likelihood function shows that, for Gaussian fluctuations, the information $I$
of equation (2) equals half the number $N$ of Gaussian modes:

$$ I = N/2 . \qquad (3) $$
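
The short calculation can be sketched as follows, under the simplifying
assumption (made here for illustration) that the data consist of $N$ independent
real Gaussian modes $\delta_i$ with variances $P_i \propto A$, so that
$\partial \ln P_i / \partial \ln A = 1$:

$$
\begin{aligned}
\ln \mathcal{L} &= -\frac{1}{2} \sum_{i=1}^{N} \left[ \ln P_i + \frac{\delta_i^2}{P_i} \right] + \mathrm{const} , \\
\frac{\partial \ln \mathcal{L}}{\partial \ln A} &= -\frac{1}{2} \sum_{i=1}^{N} \left[ 1 - \frac{\delta_i^2}{P_i} \right] , \\
\frac{\partial^2 \ln \mathcal{L}}{\partial (\ln A)^2} &= -\frac{1}{2} \sum_{i=1}^{N} \frac{\delta_i^2}{P_i} , \\
I &= \frac{1}{2} \sum_{i=1}^{N} \frac{\langle \delta_i^2 \rangle}{P_i} = \frac{N}{2} .
\end{aligned}
$$

The last step uses $\langle \delta_i^2 \rangle = P_i$.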



3 SIMULATIONS 

We generated 400 random realizations of a cubic region of the universe
256 $h^{-1}$ Mpc on a side. A further 200 realizations with a box size of
128 $h^{-1}$ Mpc were used to isolate numerical effects close to the mesh scale.
The cosmological model used was the 'vanilla-lite' model of Tegmark et al.
(2004; second-last column of their table 4): $\Omega_m = 0.29$,
$\Omega_\Lambda = 0.71$, baryon fraction $f_b = \Omega_b/\Omega_m = 0.16$,
$h = 0.71$ and $\sigma_8 = 0.97$. The matter transfer function was calculated
using the fitting formula of Eisenstein & Hu (1998). The boxes were evolved
using a particle-mesh (PM) code with $128^3$ particles and a $256^3$ force mesh.
Adaptive time-stepping was used to control the force errors; a typical run
required between 800 and 1400 steps, with the 128 $h^{-1}$ Mpc boxes requiring
more, on average, because of the higher degree of clustering. We also ran 25
realizations of a 128 $h^{-1}$ Mpc box at higher resolution using the adaptive
mesh refinement code ART (Kravtsov, Klypin & Khokhlov 1997) with a $128^3$ root
mesh and 3 levels of refinement. The initial conditions for these simulations
were set up using the GRAFIC package with the same cosmological parameters as
above. Although the small number of realizations is not sufficient to yield a
precise estimate of the covariance matrix (we find that at least several hundred
simulations are required to achieve convergence), they serve to confirm the
results of the much larger set of PM simulations on small scales.

Figure 2. Correlation matrices of estimates of the non-linear power spectrum for
(a) the 256 $h^{-1}$ Mpc boxes and (b) the 128 $h^{-1}$ Mpc boxes. Positive
correlations are shown by completely filled bins while diamonds denote
anti-correlations. Greyscale represents the magnitude of the correlations,
ranging from 0 (no correlation) to 1 (perfect correlation).

The evolution of the non-linear power spectrum is illustrated by Fig. 1, which
shows the mean, shot-noise-subtracted power spectrum of the simulations at three
epochs. The power spectrum was measured by calculating the density field on a
$256^3$ cubic mesh, using a cloud-in-cell scheme, and binning the resulting
Fourier amplitudes in radial bins in $k$-space. For the ART simulations,
'chaining' (Jenkins et al. 1998) was used to reach small scales. Smoothing due
to the mass assignment scheme was corrected for by dividing by the square of the
Fourier transform of the window function, prior to subtracting the shot noise
contribution (Smith et al. 2003). In the following analysis, we use only
wavenumbers a factor of at least two away from the Nyquist frequency of the
Fourier transform grid to avoid problems resulting from the uncertain mass
assignment correction and aliasing of power from smaller scales. The power
spectra from the two different box sizes agree well up to approximately 3 times
the scale of the force mesh of the larger boxes, but at smaller scales the power
in the PM simulations is systematically underestimated relative to the higher
resolution ART simulations. The latter agree well with fits to the results of
Smith et al. (2003) and Peacock & Dodds (1996). The error bars in Fig. 1, which
are statistically independent, are actually attached to the uncorrelated band
powers described below.
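
The measurement steps above can be sketched in a few lines of NumPy. This is an
illustrative reimplementation under stated assumptions, not the code used for
the paper: it takes an overdensity grid that has already been assigned with a
cloud-in-cell scheme, uses linearly spaced bins, and leaves the shot-noise level
(for example the box volume divided by the number of particles) to the caller.
The function name and its arguments are hypothetical.

```python
import numpy as np

def measure_power(delta, box_size, n_bins=16, shot_noise=0.0):
    """Bin-averaged power spectrum of a cubic overdensity grid.

    Returns mean wavenumber, mean power and number of grid modes per bin.
    Modes are taken from the half-space returned by rfftn, without weights.
    """
    n = delta.shape[0]
    cell = box_size / n
    volume = box_size ** 3
    delta_k = np.fft.rfftn(delta) * cell ** 3       # approximate continuous FT

    k_axis = 2 * np.pi * np.fft.fftfreq(n, d=cell)
    k_half = 2 * np.pi * np.fft.rfftfreq(n, d=cell)
    kx, ky, kz = np.meshgrid(k_axis, k_axis, k_half, indexing="ij")
    k_mag = np.sqrt(kx ** 2 + ky ** 2 + kz ** 2)

    # Cloud-in-cell window: product of sinc^2(k_i * cell / 2) over the axes.
    # np.sinc(x) is sin(pi x) / (pi x), hence the division by 2 pi.
    sinc_product = (np.sinc(kx * cell / (2 * np.pi))
                    * np.sinc(ky * cell / (2 * np.pi))
                    * np.sinc(kz * cell / (2 * np.pi)))
    window = sinc_product ** 2

    # Deconvolve the window first, then subtract the shot noise.
    power = np.abs(delta_k) ** 2 / volume / window ** 2 - shot_noise

    # Keep only wavenumbers below half the Nyquist frequency of the grid.
    k_fundamental = 2 * np.pi / box_size
    k_nyquist = np.pi / cell
    edges = np.linspace(k_fundamental, 0.5 * k_nyquist, n_bins + 1)
    index = np.digitize(k_mag.ravel(), edges) - 1
    keep = (index >= 0) & (index < n_bins)

    counts = np.bincount(index[keep], minlength=n_bins)
    power_sum = np.bincount(index[keep], weights=power.ravel()[keep], minlength=n_bins)
    k_sum = np.bincount(index[keep], weights=k_mag.ravel()[keep], minlength=n_bins)
    return k_sum / counts, power_sum / counts, counts

# Toy usage: a white-noise grid in a box of side 256 (units of h^-1 Mpc).
toy_delta = np.random.default_rng(0).normal(size=(32, 32, 32))
k_bins, p_bins, n_modes_bins = measure_power(toy_delta, 256.0, n_bins=8)
```

For a 256 $h^{-1}$ Mpc box sampled on a $256^3$ mesh the Nyquist wavenumber is
$\pi\,h\,\mathrm{Mpc}^{-1}$, so the factor-of-two cut keeps
$k \lesssim 1.6\,h\,\mathrm{Mpc}^{-1}$; for the 128 $h^{-1}$ Mpc boxes on the
same mesh the limit doubles.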

In Fig. 2 we plot the correlation coefficients between estimates of log-power in
each pair of wavenumber bins, measured from the simulations. This is simply the
covariance matrix scaled so that the diagonal elements are identically unity and
has the advantage that, whereas the elements of the covariance matrix are
smaller for larger simulation volumes, the correlation coefficients are
independent of box size. The results for the 200 PM simulations and the 25 ART
simulations of the 128 $h^{-1}$ Mpc box are consistent, so we combine them into
a single covariance matrix for this box size. Our results confirm those of
Meiksin & White (1999), who found that (i) correlations between neighbouring
band powers grow rapidly in the translinear regime, approaching 100% at
non-linear wavenumbers and (ii) even for pairs of bands in which only one of the
pair is non-linear there are significant correlations, of the order of 20-40%.
The results from the two different box sizes are again broadly consistent.
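
The scaling from covariance to correlation coefficients can be sketched as
follows. The input is assumed to be an array with one row per simulation and one
column per wavenumber bin; the function name and the toy data are hypothetical
and serve only to illustrate the operation.

```python
import numpy as np

def correlation_matrix(power_samples):
    """Correlation coefficients of log-power between wavenumber bins.

    power_samples has shape (n_simulations, n_bins).
    """
    log_power = np.log(np.asarray(power_samples, dtype=float))
    covariance = np.cov(log_power, rowvar=False)
    sigma = np.sqrt(np.diag(covariance))
    # Scale the covariance so that the diagonal is identically unity.
    return covariance / np.outer(sigma, sigma)

# Toy data: bins 0 and 1 are independent, bins 2 and 3 share a common mode.
rng = np.random.default_rng(1)
shared = rng.normal(size=(500, 1))
coupling = np.array([0.0, 0.0, 1.0, 1.0])
toy_power = np.exp(0.1 * rng.normal(size=(500, 4)) + 0.1 * shared * coupling)
corr = correlation_matrix(toy_power)
```

Rescaling every column of the input by a constant leaves the result unchanged,
since a multiplicative factor on the power is an additive constant in log-power.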

We assign information to each wavenumber bin by defin- 
ing a set of uncorrelated band-power estimates: 



$$ \frac{\hat{B}(k)}{P(k)} = \sum_{k'} W(k,k') \frac{\hat{P}(k')}{P(k')} \qquad (4) $$



(we use hats to denote measurements from individual simulations and hatless
symbols for averages over all simulations). $W(k, k')$ is a decorrelation matrix
(Hamilton & Tegmark 2000). Of the infinity of possible schemes for decorrelating
the power spectrum, we use here the upper Cholesky decomposition of the Fisher
matrix of the scaled power spectrum, suitably normalized, as our decorrelation
matrix. We experimented with several decorrelation schemes and this was the only
one that reproduced the expected amount of information in the linear regime,
where it is reasonable to expect that information is conserved. Mathematically,
it is equivalent to taking a matrix composed of all the elements of the
covariance matrix up to some wavenumber $k_{\max}$, inverting this matrix and
summing all the elements of the resulting Fisher matrix to arrive at a measure
of the accumulated information, $I(< k_{\max})$, up to that wavenumber. Scaling
both sides of equation (4) by $P(k)$ guarantees that the mean of the
uncorrelated band power estimates at each wavenumber is equal to the mean power
spectrum. It is the errors on the decorrelated band powers that are plotted in
Fig. 1. Notice that some of the band powers have error bars much larger than the
actual scatter of the data points; this simply implies that those points contain
almost no independent information.

Figure 3. Cumulative information in the non-linear power spectrum at three
epochs (a = 0.5, 0.67 and 1) as a function of (a) comoving and (b) physical
wavenumber. Large symbols are points derived from the 256 $h^{-1}$ Mpc boxes and
small symbols are from the 128 $h^{-1}$ Mpc boxes (PM+ART). The results for the
128 $h^{-1}$ Mpc boxes have been shifted vertically by a factor 8 to account for
the higher density of modes at a given comoving $k$. The dotted lines show the
information in the linear power spectrum. The solid curves are the result of
applying the Peacock & Dodds (1996) wavenumber scaling to the linear
information.
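
The accumulated information $I(< k_{\max})$ described in the text can be
sketched directly from a covariance matrix of scaled power estimates. The first
function below follows the block-inversion description, weighting by the slopes
$\partial \ln P / \partial \ln A$ (taken as unity if omitted). The second is an
equivalent route through a Cholesky factor of the covariance matrix; it is
included as a standard linear-algebra identity, not as a statement of how the
paper's decorrelation matrix was coded. All names are hypothetical.

```python
import numpy as np

def cumulative_information(covariance, slopes=None):
    """I(< k_max) for every k_max, by inverting leading blocks of the covariance."""
    covariance = np.asarray(covariance, dtype=float)
    n_bins = covariance.shape[0]
    slopes = np.ones(n_bins) if slopes is None else np.asarray(slopes, dtype=float)
    info = np.empty(n_bins)
    for m in range(1, n_bins + 1):
        fisher = np.linalg.inv(covariance[:m, :m])
        # Sum of Fisher elements, weighted by the slopes on either side.
        info[m - 1] = slopes[:m] @ fisher @ slopes[:m]
    return info

def cumulative_information_cholesky(covariance, slopes=None):
    """Same quantity from the lower Cholesky factor L of the covariance.

    With covariance = L L^T, the vector L^{-1} slopes has uncorrelated,
    unit-variance components, and the running sum of their squares
    reproduces the block-inversion result.
    """
    covariance = np.asarray(covariance, dtype=float)
    n_bins = covariance.shape[0]
    slopes = np.ones(n_bins) if slopes is None else np.asarray(slopes, dtype=float)
    lower = np.linalg.cholesky(covariance)
    whitened = np.linalg.solve(lower, slopes)
    return np.cumsum(whitened ** 2)

# Linear-regime toy case: diagonal covariance with variances 2 / N per bin.
n_modes = np.array([10, 20, 40, 80])
linear_cov = np.diag(2.0 / n_modes)
info_linear = cumulative_information(linear_cov)
```

In the toy linear case each bin contributes half its number of modes, so the
cumulative information is the running sum of `n_modes / 2`, in line with
equation (3). Strong correlations between bins flatten the curve, because a
correlated bin adds little to the sum.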



4 DISCUSSION 

Fig. 3 presents the essential results of this paper. It shows the cumulative
information as a function of both comoving and physical wavenumber at three
epochs. This plot reveals two striking and unexpected features. The first is how
flat the cumulative information becomes in the translinear regime
($k \sim 0.2$-$0.8\,h\,\mathrm{Mpc}^{-1}$ at $a = 1$). The flatness appears
consistently at all epochs, in both box sizes and using two different simulation
codes. The results from the 25 ART simulations are consistent, within the
scatter, with the larger suite of 128 $h^{-1}$ Mpc PM simulations so we have
combined the results in a single curve for this box size, using the power
spectrum from the ART simulations, which are more accurate on small scales, to
calculate the partial derivatives $\partial \ln P(k) / \partial \ln A$ in
equation (2). There are two possible reasons for the flatness of cumulative
information: either information is being lost from the power spectrum rather
abruptly as structures enter the translinear regime and begin to collapse, or
else information is flowing rapidly from large to small scales. The HKLM-PD
formalism predicts that information should indeed flow from large to small
scales as structures collapse but, as Fig. 3 shows, if information is preserved
then the flow of information is far more rapid than predicted by the PD formula.

The second remarkable feature of the curves in Fig. 3 is that in the highly
non-linear regime the cumulative information begins to rise sharply again (at
$k \sim 0.8\,h\,\mathrm{Mpc}^{-1}$ for $a = 1$). This upturn occurs consistently
at all epochs,
in both box sizes and with both codes, making it unlikely 
to be a numerical effect in the simulations (which ought, at 
the very least, to scale with the box size). The upturn is 
difficult to explain if information in the power spectrum is 
completely destroyed during translinear collapse. The pos- 
sibility remains that during translinear evolution informa- 
tion is temporarily diverted out of the power spectrum into 
higher order moments and that it is somehow restored into 
the power spectrum at non-linear scales, but this explana- 
tion seems contrived and we do not explore it further. 

The definitive test for whether information is being de- 
stroyed is to look at the cumulative information in the highly 
non-linear regime. At highly non-linear scales, structures 
virialize and therefore cease to collapse rapidly. The HKLM- 
PD formalism assumes the stable clustering hypothesis: that 
following virialization structures remain of fixed size in physical
co-ordinates. Probably, the assumption of stable clustering is not precisely
true (e.g. Padmanabhan et al. 1996; Smith et al. 2003). Nevertheless, the
collapse or expansion of
structures in the virialized regime is much slower than the 
rapid dynamic collapse that takes place in the translinear 
regime. 

It follows that, if translinear and non-linear evolution preserve information
in the power spectrum, then the cumulative information at a given fixed physical
scale in the highly non-linear regime should remain constant. In other words,
the cumulative information at different epochs, plotted as a function of
physical wavenumber $k/a$, should asymptote to a single common line in the
highly non-linear regime, as it does in the PD formula. Conversely, if evolution
destroys information, then the cumulative information at a fixed non-linear
physical scale should decrease systematically with time. Fig. 3(b) shows that
the cumulative information in the non-linear power spectrum at the smallest
physical scales that can be reliably measured is consistent, within the
uncertainties, with no loss of information. What is more surprising is that on
small scales, the cumulative information at the two later epochs exceeds that at
the same physical scale at $a = 0.5$, although there are indications that all
three curves will eventually asymptote to the same value in the highly
non-linear regime. This temporary increase of information suggests that
structures bounce back prior to virialization. At $a = 1$, for example,
structures at $k \sim 0.2$-$0.8\,h\,\mathrm{Mpc}^{-1}$ are in rapid collapse,
following turnaround. At smaller scales, structures have bounced and are
actually expanding, before becoming fully virialized at
$k \sim 3\,h\,\mathrm{Mpc}^{-1}$.

The simulations presented here are consistent with the 
hypothesis that non-linear evolution largely preserves infor- 
mation in the power spectrum. If this is true, then a whole- 
sale revision of the HKLM-PD formalism is needed. The 
simulations suggest rapid translinear collapse followed by 
a bounce and subsequent virialization, in contrast to the 
rather gentle behaviour predicted by the HKLM-PD formal- 
ism. 

This rapid collapse can be construed as supporting the 
alternative picture of non-linear evolution put forward by 
the halo model (e.g. Ma & Fry 2000; Peacock & Smith 2000; Seljak 2000). In the
halo model, the cosmic density field is
treated as a set of discrete, collapsed dark matter haloes, 
whose centres are clustered according to linear theory. The 
non-linear contribution to the matter power spectrum is de- 
termined by the density profiles of the individual haloes. 
Implicit in this picture is the assumption that the transition 
between the linear and virialized regimes is an abrupt one. 
Our direct measurements of the flow of information from 
large to small scales confirm that this indeed appears to be 
the case. 

The results reported in this paper have several other 
practical implications beyond those relevant to analytic 
models of non-linear evolution. Firstly, if non-linear evolu- 
tion completely preserves information in the power spec- 
trum, then information about baryonic wiggles is preserved. 
Simulations different from, and better than, those carried out for this paper
will be necessary to test what actually happens to baryonic wiggles. Secondly,
if the translinear collapse to small
scales is indeed as rapid as suggested by the simulations in 
this paper, then baryonic wiggles should be stretched over 
translinear scales much more than had previously been an- 
ticipated. Thirdly, if the translinear power spectrum con- 
tains little additional information over and above that in 
the linear power spectrum, then efforts to measure power at 
translinear scales may not be as rewarding, in the sense of 
refining estimates of cosmological parameters, as might have 
been anticipated. It should be emphasized that one should 
not interpret our results as implying that the translinear 
regime contains no information; rather, the information in 



the translinear regime is degenerate with that in the linear 
regime. 

In future work it will be interesting not only to test the 
results of the present paper at smaller scales, but also to 
investigate the extent to which non-linear evolution does 
or does not preserve information about other cosmologi- 
cal quantities, such as the primordial spectral index, or the 
baryon fraction.